Gaussian Map based Acoustic Model Adaptation Using Untranscribed Data for Speech Recognition in Severely Adverse Environments
Authors
Abstract
This study proposes an acoustic model adaptation scheme that uses untranscribed data to improve speech recognition in severely adverse environments. In the proposed method, a clean GMM is estimated from clean training data, and a noise-corrupted GMM is obtained by MAP adaptation over the adaptation data. Each Gaussian component of the adapted HMMs is then obtained by applying the transform of the most similar Gaussian component of the GMM. The proposed mixture-selective model adaptation method is evaluated on an LDC corpus that represents severely adverse communication channel environments. The experimental results show that the proposed method performs comparably to or better than conventional MLLR adaptation. The proposed method is also effective at improving speech recognition when independent adaptation data sets are used. The results demonstrate that the proposed method is significantly more effective in severely noisy conditions, where transcribed data is unavailable and the baseline ASR fails to transcribe the adaptation data accurately because of acoustic condition mismatch.
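The three steps named in the abstract (a clean GMM, a MAP-adapted noisy GMM, and a per-Gaussian transform chosen by similarity) can be sketched roughly as follows. This is only an illustration under several assumptions that the abstract does not confirm: the GMMs have diagonal covariances, MAP adaptation updates the means only, the "transform" is a plain mean shift between paired clean and noisy GMM components, and "most similar" means the smallest squared Euclidean distance between means. The function names and the relevance factor `tau` are hypothetical.

```python
import numpy as np


def map_adapt_means(means, variances, weights, data, tau=10.0):
    """Mean-only MAP adaptation of a diagonal-covariance GMM (assumed form).

    means, variances: (K, D) arrays; weights: (K,); data: (T, D).
    tau is a hypothetical relevance factor weighting the prior means.
    """
    diff = data[:, None, :] - means[None, :, :]                # (T, K, D)
    log_prob = -0.5 * np.sum(
        diff ** 2 / variances + np.log(2.0 * np.pi * variances), axis=2
    )                                                          # (T, K)
    log_prob = log_prob + np.log(weights)
    # Normalise in the log domain to get component posteriors per frame.
    log_prob -= log_prob.max(axis=1, keepdims=True)
    post = np.exp(log_prob)
    post /= post.sum(axis=1, keepdims=True)
    occupancy = post.sum(axis=0)                               # (K,)
    first_order = post.T @ data                                # (K, D)
    # Interpolate between the prior mean and the data statistics.
    return (tau * means + first_order) / (tau + occupancy)[:, None]


def adapt_hmm_means(hmm_means, clean_means, noisy_means):
    """Shift each HMM Gaussian mean by the clean-to-noisy offset of the
    nearest clean GMM component (assumed transform and similarity measure).
    """
    shifts = noisy_means - clean_means                         # (K, D)
    dist = np.sum(
        (hmm_means[:, None, :] - clean_means[None, :, :]) ** 2, axis=2
    )                                                          # (M, K)
    nearest = dist.argmin(axis=1)                              # (M,)
    return hmm_means + shifts[nearest]
```

In this sketch no transcription is needed at any point: `map_adapt_means` only requires the adaptation frames, and `adapt_hmm_means` pairs HMM Gaussians with GMM components by distance rather than by alignment, which is the property the abstract highlights for conditions where the baseline recognizer cannot transcribe the adaptation data.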
Similar resources
Speaker Adaptation in Continuous Speech Recognition Using MLLR-Based MAP Estimation
A variety of methods are used for speaker adaptation in speech recognition. In some techniques, such as MAP estimation, only the models with available training data are updated. Hence, large amounts of training data are required in order to have significant recognition improvements. In some others, such as MLLR, where several general transformations are applied to model clusters, the results ar...
Cross-lingual acoustic model adaptation based on transfer vector field smoothing with MAP
We propose a method to adapt acoustic models for robust speech recognition in real environments using data from other languages. In real-world speech recognition systems, we can effectively adapt acoustic models using the speech data logged by the system. However, when developing a system for a new language, this step is impossible since we have no such speech data for it. Assuming that similar...
Persian Phone Recognition Using Acoustic Landmarks and Neural Network-based variability compensation methods
Speech recognition is a subfield of artificial intelligence that develops technologies to convert speech utterance into transcription. So far, various methods such as hidden Markov models and artificial neural networks have been used to develop speech recognition systems. In most of these systems, the speech signal frames are processed uniformly, while the information is not evenly distributed ...
Uncertainty-based learning of acoustic models from noisy data
We consider the problem of acoustic modeling of noisy speech data, where the uncertainty over the data is given by a Gaussian distribution. While this uncertainty has been exploited at the decoding stage via uncertainty decoding, its usage at the training stage remains limited to static model adaptation. We introduce a new Expectation Maximisation (EM) based technique, which we call uncertainty...